Round Overview
GoodCompanyDivTwo
Problem Details
Used as: Division Two - Level One:
Value | 250 |
Submission Rate | 600 / 834 (71.94%) |
Success Rate | 526 / 600 (87.67%) |
High Score | rounak_patni for 249.76 points (0 mins 53 secs) |
Average Score | 172.51 (for 526 correct submissions) |
This is an implementation problem. First of all, each employee `i` defines a department. We need to extract the department given the employee: build a list that includes `i` and also any `j` such that `superior[j] = i`. The constraints are small enough that this direct approach is good enough.
For each department, we need to verify that the work types assigned to its members are all distinct. If so, the department is diverse and we increment the count. One way, as in the following C++ solution, is to simply use an `O(m^2)` loop (where `m` is the number of employees in the department).
int countGood(vector<int> superior, vector<int> workType) {
    int c = 0;
    int n = superior.size();
    // for each employee i:
    for (int i = 0; i < n; i++) {
        // find the members of the department
        vector<int> department(1, i); // i is in the department
        for (int j = 0; j < n; j++) {
            if (superior[j] == i) {
                department.push_back(j);
            }
        }
        // check if there are no duplicate work types:
        int m = department.size();
        bool diverse = true;
        for (int a = 0; a < m; a++) {
            for (int b = a + 1; b < m; b++) {
                if (workType[department[a]] == workType[department[b]]) {
                    diverse = false;
                }
            }
        }
        if (diverse) {
            c++;
        }
    }
    return c;
}
We can also get clever. For example, in the following Python solution, we check whether converting the list to a set (and thus removing duplicates) yields the same length as the original list; if it does, there were no duplicates.
class GoodCompanyDivTwo:
    def countGood(self, superior, workType):
        def isDiverse(department):
            works = [workType[i] for i in department]
            return len(works) == len(set(works))

        def getDepartment(i):
            return [i] + [j for (j, x) in enumerate(superior) if x == i]

        return sum(isDiverse(dep) for dep in [getDepartment(i) for i in range(len(superior))])
ChooseTheBestOne
Problem Details
Used as: Division Two - Level Two:
Value | 500 |
Submission Rate | 379 / 834 (45.44%) |
Success Rate | 234 / 379 (61.74%) |
High Score | mwc123 for 491.33 points (3 mins 47 secs) |
Average Score | 303.21 (for 234 correct submissions) |
This is a simulation problem. Imagine we have a list of t-shirt numbers from 1 to `n`. We will remove a t-shirt from the list `n-1` times. Even if we needed `O(n)` processing time per removal, we would still have an `O(n^2)` solution.
The trick is how to do the selection and removal of each t-shirt in `O(n)` time. Suppose the current position in the list (possibly after some removals) is index `i` and the current number of t-shirts is `m`. The list is cyclic because the t-shirts are in a circle. At step `t` we count `t^3` t-shirts starting from position `i` before reaching the one to remove. What is the index of that t-shirt? If `i + t^3 - 1 < m`, the index is simply `i + t^3 - 1`; but if `x = i + t^3 - 1 >= m`, we ran past the end of the cyclic list and should subtract `m` from `x`; if `x` is still `>= m`, we subtract `m` again, and again. This is just the division algorithm: we want the remainder of dividing `i + t^3 - 1` by `m`. Thus the index is `(i + t^3 - 1) % m`, which we can find in `O(1)` time. Just be careful: since `t` can be up to 4999, `t^3` will overflow a 32-bit integer, so we should use a 64-bit one for this operation.
int countNumber(int N) {
    vector<int> tshirts(N);
    for (int i = 1; i <= N; i++) {
        tshirts[i - 1] = i;
    }
    int i = 0;
    // note we use a long (64 bits in TC servers)
    for (long t = 1; t < N; t++) {
        int m = tshirts.size();
        // find the index of the t-shirt to remove:
        i = (int)((i + t * t * t - 1) % m);
        // remove the i-th t-shirt: 1) shift left.
        for (int j = i; j < m - 1; j++) {
            tshirts[j] = tshirts[j + 1];
        }
        // 2) resize the list to remove the last element:
        tshirts.resize(m - 1);
    }
    return tshirts[0];
}
class ChooseTheBestOne:
    def countNumber(self, N):
        tshirts = range(1, N + 1)
        i = 0
        for s in xrange(1, N):
            i = (i + s ** 3 - 1) % len(tshirts)
            # remove the i-th t-shirt
            tshirts = tshirts[:i] + tshirts[i + 1:]
        return tshirts[0]
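As a sanity check, the modular-index formula can be compared against a literal rotate-and-remove simulation of the circle. Below is a minimal sketch using Python's `collections.deque` (the function name `count_number_deque` is ours); it keeps the current position at the front of the deque and should agree with the solutions above for every `N`.
from collections import deque

def count_number_deque(N):
    # Invariant: circle[0] is the current position in the circle.
    circle = deque(range(1, N + 1))
    for t in range(1, N):
        # Counting t^3 t-shirts starting from the current one means the
        # t-shirt to remove is t^3 - 1 steps ahead; bring it to the front.
        circle.rotate(-((t ** 3 - 1) % len(circle)))
        circle.popleft()  # remove it; circle[0] becomes the next t-shirt
    return circle[0]
After each removal the element that followed the removed t-shirt sits at the front, which matches how `i` is maintained in the solutions above.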
EmployManager
Problem Details
Used as: Division Two - Level Three:
Value | 950 |
Submission Rate | 55 / 834 (6.59%) |
Success Rate | 24 / 55 (43.64%) |
High Score | simenl for 853.37 points (9 mins 47 secs) |
Average Score | 587.05 (for 24 correct submissions) |
It is very helpful to assume that initially we haven't hired anyone. In this initial state, we lose all the values in the earnings pairs, so initially the result is possibly negative. From this perspective, hiring an employee will increase the profit by some of the earnings values.
Consider the first employee `i` we decide to hire. All the earnings pairs in which `i` is a member stop being penalties: candidate `i` no longer combines with any other candidate `j` to bring our profit down by earnings[i][j]. This translates into increasing the profit by the sum of all the earnings in column `i` (or row `i`).
Even more interesting is what happens with the second employee we hire. Let's say the second employee is `k`. Some earnings are still penalties, so by hiring `k` our profit is no longer penalized by earnings[k][j] for each `j != i` (`i` is still the first hired employee). In addition, because both `i` and `k` are now hired, we get a bonus equal to earnings[i][k] = earnings[k][i]. In total, the profit again increases by the sum of all the earnings in column `k`: earnings[k][i] is not among the cancelled penalties, but it comes back as the bonus.
It appears that whenever we hire an employee `x`, the profit increases by the sum of the earnings in column `x` and decreases by value[x]. If this is true, then the solution is very simple: whenever value[x] is less than the sum of the earnings in column `x`, it is best to hire candidate `x`.
Is the property true? Say we hire candidate `x` and there is a set `S` of employees we already hired. Because those penalties are cancelled, the profit increases by earnings[x][i] for each `i` not in `S`. Also, for each `i` in `S`, the pair `(i, x)` is now fully hired, so the profit also increases by earnings[x][i] for each `i` in `S`. In total, we gain earnings[x][i] for every `i`, whether it is in `S` or not, which is exactly the sum of all the values in column `x`.
int maximumEarnings(vector<int> value, vector<string> earning) {
    int res = 0;
    int n = value.size();
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            res -= (earning[i][j] - '0');
        }
    }
    for (int i = 0; i < n; i++) {
        int col = 0;
        for (int j = 0; j < n; j++) {
            col += (earning[j][i] - '0');
        }
        if (col > value[i]) {
            res += col - value[i];
        }
    }
    return res;
}
class EmployManager:
    def maximumEarnings(self, value, earning):
        # turn earning into a table of integer numbers:
        earning = [[ord(y) - ord('0') for y in x] for x in earning]
        # start with the sum of -earning[i][j] for each i < j
        res = -sum(sum(earning[i][j] for j in range(i)) for i in range(len(earning)))
        # for each candidate i, if value[i] is less than the sum of its column, hire:
        for (i, vi) in enumerate(value):
            e = sum(earning[j][i] for j in range(len(earning))) - value[i]
            if e > 0:
                res += e
        return res
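To build confidence in this greedy, here is a small brute-force cross-check over all hire subsets, written under the profit model described above (a pair earns earning[i][j] when both members are hired, counts as a penalty of earning[i][j] when neither is hired, and each hire costs value[i]); it takes `earning` as an integer matrix, i.e. the digit strings already converted as in the solutions. The helper names are ours.
from itertools import combinations

def subset_profit(hired, value, earning):
    # Profit of one specific hire set, following the accounting above.
    res = -sum(value[i] for i in hired)
    for i, j in combinations(range(len(value)), 2):
        if i in hired and j in hired:
            res += earning[i][j]      # both hired: the pair pays off
        elif i not in hired and j not in hired:
            res -= earning[i][j]      # neither hired: the pair is a penalty
    return res

def brute_force(value, earning):
    n = len(value)
    return max(subset_profit({i for i in range(n) if mask >> i & 1}, value, earning)
               for mask in range(1 << n))
For small `n`, `brute_force(value, earning)` should return the same number as `maximumEarnings`.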
Used as: Division One - Level One:
Value | 250 |
Submission Rate | 667 / 681 (97.94%) |
Success Rate | 504 / 667 (75.56%) |
High Score | one_one_onto for 247.79 points (2 mins 41 secs) |
Average Score | 202.66 (for 504 correct submissions) |
Each step decreases the total number of piles by 1. A good starting point for this problem is to consider the smallest cases first.
The small cases, with `n <= 2`, are losing states: if a player is given one of these as a starting point, they lose, because there must be at least 3 piles for a valid move to exist. Try `n = 3`.
Exactly one `n = 3` state is a losing state: if all piles have 1 stone, there are no possible moves.
Any other case with `n = 3` is a winning state. Any move reduces the number of piles by 1, and we know the cases with `n <= 2` are losing states. So the next player will lose and the current player will win.
Now try `n = 4`. We know that almost all `n = 3` states are winning states. In order to win with `n = 4`, the current player must make a move that takes the game to the only losing state with `n = 3`, the one in which every pile has size 1. However, this is simply impossible. A valid move requires you to pick three piles `X, Y, Z`. Pile `X` is split into two piles of size greater than 0, and the two parts are added to `Y` and `Z`, so the sizes of `Y` and `Z` each increase by at least 1. Since `Y` and `Z` initially had size at least 1, after any valid move at least two piles have size 2 or more. It is impossible to win when `n = 4`.
For `n = 5`, any valid move takes the game to `n = 4`, and we already know that all states with `n = 4` are losing states. This means that, if there is at least one valid move, then `n = 5` is a winning state. The only exception is the state where all piles have size 1.
At this point, things become repetitive and we can generalize (this can be proven formally by induction, using the same logic as in the previous steps):
If `n` is even, then the next player to move will lose.
If `n <= 2` or all piles have size 1, then the next player will lose.
In any other case, the next player wins.
string winOrLose(vector<int> number) {
    int n = number.size();
    if ((n % 2 == 0) || (n <= 2) || (number == vector<int>(n, 1))) {
        return "LOSE";
    } else {
        return "WIN";
    }
}
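To double-check the generalization, one can brute-force tiny cases with a memoized win/lose search over sorted pile tuples, using the move described above (split one pile into two positive parts and add them to two other piles). This is only an illustrative sketch; the helper name is ours.
from functools import lru_cache
from itertools import permutations

@lru_cache(maxsize=None)
def wins(piles):
    # piles: sorted tuple of pile sizes; True if the player to move wins.
    n = len(piles)
    for x, y, z in permutations(range(n), 3):
        for a in range(1, piles[x]):           # split piles[x] into a + b, both > 0
            b = piles[x] - a
            rest = list(piles)
            rest[y] += a
            rest[z] += b
            del rest[x]
            if not wins(tuple(sorted(rest))):  # a move to a losing state wins
                return True
    return False
For the small states this can explore, `wins` should agree with the rule above: the position is winning exactly when the number of piles is odd, at least 3, and not all piles have size 1.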
GoodCompanyDivOne
Problem Details
Used as: Division One - Level Two:
Value | 600 |
Submission Rate | 165 / 681 (24.23%) |
Success Rate | 57 / 165 (34.55%) |
High Score | cgy4ever for 428.03 points (19 mins 45 secs) |
Average Score | 284.54 (for 57 correct submissions) |
The superior list in the input generates a forest. Each node may have children, and the parent-child relationship in this forest means that the parent node is the boss of the child nodes.
Let's focus on the root of a tree in this forest. This root and its children make a department. Let's say the size of the department is `m`. The department needs to be diverse: there should exist at least one set of `m` skills such that each skill in the set can be used by a different member of the department.
Let's focus on how to assign the skills of this set to the members of the department. Member 0 (the root) gets a skill `s_0`, and the other members get skills `s_1`, `s_2`, …, `s_(m-1)`; the skills are pairwise distinct. We should try to find the sequence `s_0, s_1, …, s_(m-1)` that yields the minimum possible cost, but first we need to know how to find the cost of a given sequence. The sequence determines the cost in this way: one of the two skills assigned to member `i` is `s_i`.
For the root it is simple: we know the cost to train skill `s_0`, and the remaining skill can be any skill other than `s_0`; out of the `k-1` options we should pick the cheapest one.
For the other employees, it is more complicated. For each `i`, we need to find the cost to do the remaining assignments in the sub-tree rooted at node `i`, given that one of the skills assigned to node `i` is `s_i`. We might need to pick a specific second skill for node `i` to minimize the remaining assignments. We have found a very important sub-problem: can you solve the function `f(x,p)` that finds the minimum cost to do the assignments in the sub-tree rooted at `x` such that one of the skills assigned to `x` is `p`? Note that if we solve `f(x,p)`, we can use it for the root too: just try each of the `O(k)` options for `p` and pick the minimum `f(x,p)`.
Let's solve `f(x,p)` for any node `x`. We can find the members of the department: `x` gets index 0 and the remaining members get indices from 1 to `m-1`. For each child `y` of `x`, we assume that we already know `f(y,p)` for any skill `p`.
We have to pick the optimal sequence `s_0, s_1, …, s_(m-1)` of skills that will be used to make this department diverse. The idea that solves this problem is that for each pair of employee `i` and skill `j`, we can find the cost `c_(ij)` to make `s_i = j`.
For `i > 0`, we know that `c_(ij) = f(y_i, j)`, where `y_i` is the i-th child of `x`.
For `i = 0`, it is different, because the root must have skill `p` as one of its two skills. There are two possibilities:
We let `s_0 = p`, meaning that `p` is part of the special sequence. We still have to pick the second skill for the sub-tree root; it is best to pick the cheapest training different from `p`.
Or we let `s_0 = j != p`; then `p` is the second skill, but we need to pay `training[j]` cost units.
Finally, we have an instance of this problem: a table of costs `c_(ij)`, and we need to assign each `i` to a distinct `j`. This is the assignment problem. We can solve it as a minimum cost bipartite matching, which in turn can be solved with a minimum cost max flow algorithm.
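As a small illustration of what this assignment step computes, for a tiny cost table one can even brute-force over permutations: assign member `i` the skill `perm[i]` and keep the cheapest total. The editorial solution below uses min cost max flow instead; the function name here is ours and the `INF` convention mirrors the code below.
from itertools import permutations

INF = 10 ** 9

def cheapest_assignment(cost):
    # cost[i][j]: cost of giving member i skill j; members need distinct skills.
    m, k = len(cost), len(cost[0])
    if m > k:
        return INF  # more members than skills: no valid assignment
    best = INF
    for perm in permutations(range(k), m):
        best = min(best, sum(cost[i][perm[i]] for i in range(m)))
    return best
This is exponential and only meant to show what the minimum cost matching computes; for larger departments the flow-based solution below is the practical choice.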
The base case of `f(x,p)` is reached when `x` is a leaf: there are no children of `x`, and the minimum cost is the cost of training skill `p` plus the cheapest other skill.
const int INF = 1000000000;
vector<int> training;
vector<int> superior;
int n, k;
int dp[30][31];

int f(int x, int p) {
    int res = dp[x][p];
    if (res == -1) {
        // Get the department led by x
        vector<int> y(1, x);
        for (int j = 0; j < n; j++) {
            if (superior[j] == x) {
                y.push_back(j);
            }
        }
        int m = y.size();
        int table[m][k];
        table[0][p] = INF; // pick a second skill that is any skill other than p
        for (int i = 0; i < k; i++) {
            if (i != p) {
                table[0][p] = std::min(table[0][p], training[i]);
                table[0][i] = training[i]; // the cost of the other skill
            }
        }
        // The costs for the children assignments
        for (int j = 1; j < m; j++) {
            for (int i = 0; i < k; i++) {
                table[j][i] = f(y[j], i);
            }
        }
        // Solve the assignment problem using the costs in the table.
        //
        // We will use a min cost max flow solver implemented as a
        // "network" class. The network in this case has some methods:
        // - addVertex() adds a vertex to the graph, returns its index
        // - addEdge(u, v, capacity, cost) adds an edge connecting vertices
        //   with indices u and v with specific capacity and cost.
        // - minCostMaxFlow() finds the minimum cost max flow and saves flow
        //   and cost in attributes: totalFlow and totalCost
        unique_ptr<MinCostFlow::network> g(new MinCostFlow::network());
        for (int i = 0; i < m + k; i++) {
            g->addVertex();
        }
        g->source = g->addVertex();
        g->sink = g->addVertex();
        for (int i = 0; i < m; i++) {
            g->addEdge(g->source, i, 1, 0);
        }
        for (int i = 0; i < k; i++) {
            g->addEdge(m + i, g->sink, 1, 0);
        }
        for (int j = 0; j < m; j++) {
            for (int i = 0; i < k; i++) {
                g->addEdge(j, m + i, 1, table[j][i]);
            }
        }
        g->minCostMaxFlow();
        // If the max flow is less than m, there are too many employees in
        // the department for too few skill choices. Impossible.
        if (g->totalFlow != m) {
            res = INF;
        } else {
            res = g->totalCost + training[p];
        }
        dp[x][p] = res;
    }
    return res;
}

int minimumCost(vector<int> superior, vector<int> training) {
    this->superior = superior;
    this->training = training;
    memset(dp, -1, sizeof(dp));
    n = superior.size();
    k = training.size();
    int res = 0;
    for (int i = 0; i < n; i++) {
        if (superior[i] == -1) {
            int m = INF;
            for (int j = 0; j < k; j++) {
                m = std::min(m, f(i, j));
            }
            res += m;
        }
    }
    return (res >= INF) ? -1 : res;
}
SimilarSequencesAnother
Problem Details
Used as: Division One - Level Three:
Value | 1000 |
Submission Rate | 2 / 681 (0.29%) |
Success Rate | 0 / 2 (0.00%) |
High Score | null for null points (NONE) |
Average Score | No correct submissions |
Let's begin by noticing that the problem asks us to count the number of pairs of sequences of length `n` such that their Longest Common Subsequence (LCS) has length `n-2`, `n-1` or `n`. Thus: `|A| = n`, `|B| = n` and `LCS(A,B) >= n - 2`.
It’s useful to focus on how the LCS problem is solved. Let’s use the traditional `O(n^2)` dynamic programming to find the length of the LCS:
dp[i][j] = 0, if (i < 0) or (j < 0)
dp[i][j] = dp[i-1][j-1] + 1, if A[i] = B[j]
dp[i][j] = max{ dp[i-1][j], dp[i][j-1] }, if A[i] != B[j]
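For reference, a direct sketch of this recurrence in Python (the name `lcs_length` is ours; the table is shifted by one so that the `i < 0` / `j < 0` base cases become row 0 and column 0):
def lcs_length(A, B):
    # dp[i + 1][j + 1] plays the role of dp[i][j] in the recurrence above.
    dp = [[0] * (len(B) + 1) for _ in range(len(A) + 1)]
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            if a == b:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(A)][len(B)]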
Since `LCS(A,B) >= n - 2`, an optimal alignment deletes at most 2 elements of `A` and at most 2 elements of `B`. A move to `dp[i-1][j]` or `dp[i][j-1]` is the same as deleting `A[i]` or `B[j]`, respectively. Initially, `i = j = n - 1`. If we repeat the move to `[i-1][j]` three times we reach `i = j - 3`, and if we repeat the move to `[i][j-1]` three times we reach `j = i - 3`. Reaching either of these cases means we deleted too many elements from `A` or from `B`. It follows that when filling the dp table, we only need cells in which `|i - j| <= 2`, so let's mark the other positions as invalid; when solving the dp, we should never rely on their values. This is the condition the pair of sequences should fulfill.
Consider `A = [0, 1, 3, 2]` and `B = [1, 4, 4, 3]`. Their LCS is 2. So let’s simulate the table.
 \A |  -  0  1  3  2
B\  |
----+----------------
 -  |  0  0  0  0  0
 1  |  0  0  1  1  1
 4  |  0  0  1  1  1
 4  |  0  0  1  1  1
 3  |  0  0  1  2  2
When filling this from `[3][3]`, you can see that we never need the actual values of the invalid cells. We can just assume that these cells have a very small value and therefore we can ignore them:
 \A |  -  0  1  3  2
B\  |
----+----------------
 -  |  0  0  0  *  *
 1  |  0  0  1  1  *
 4  |  0  0  1  1  1
 4  |  *  0  1  1  1
 3  |  *  *  1  2  2
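Put differently, to decide whether a given pair is similar we only ever need the diagonal band `|i - j| <= 2` of the table. Here is a small sketch of such a check (our own helper, handy for brute-forcing tiny cases against a candidate solution); it treats out-of-band cells as "very small", exactly as described above:
def is_similar(A, B):
    # True iff LCS(A, B) >= n - 2, looking only at the band |i - j| <= 2.
    # Here dp[i][j] is the banded LCS of the first i elements of A and the
    # first j elements of B; out-of-band cells stay at -infinity.
    n = len(A)
    NEG = float('-inf')
    dp = [[NEG] * (n + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i in range(n + 1):
        for j in range(n + 1):
            if (i == 0 and j == 0) or abs(i - j) > 2:
                continue
            best = NEG
            if i > 0 and j > 0 and A[i - 1] == B[j - 1] and dp[i - 1][j - 1] != NEG:
                best = dp[i - 1][j - 1] + 1
            if i > 0:
                best = max(best, dp[i - 1][j])
            if j > 0:
                best = max(best, dp[i][j - 1])
            dp[i][j] = best
    return dp[n][n] >= n - 2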
A recurrence idea goes as follows: fill the two sequences by selecting a value for both A[i] and B[i], and then progressing to A[i+1] and B[i+1]. When picking the values of A[i] and B[i], you are filling column and row `i` of the table.
 \A | i-4  i-3  i-2  i-1   i
B\  |
----+------------------------
i-4 |  x    x    x    *    *
i-3 |  x    x    x    x    *
i-2 |  x    x    x    x    ?
i-1 |  *    x    x    x    ?
i   |  *    *    ?    ?    ?
It is important that dp[i][i-3] and dp[i-3][i] are never used when filling the dp. Since this is all we need to ensure, there are many values of the table we can ignore.
We need to find the values for column `i` and row `i`, and for that there are only a few values of the table and of `A[]` and `B[]` that we need to remember:
 \A | i-4   i-3  i-2  i-1    i
B\  |
----+--------------------------
i-4 |  -     -    -    -     -
i-3 |  -     -    -    x   *****
i-2 |  -     -    -    x     ?
i-1 |  -     x    x    x     ?
i   |  -   *****  ?    ?     ?
E.g., to find dp[i][i-2] we might need the values of dp[i-1][i-3] and dp[i-1][i-2]. The values of `A[i-2]` and `B[i]` also affect the result. We assume dp[i][i-3] is very small and can ignore its value.
From this we can see that when filling `i`, there are only 9 variables that we need to know: `A[i-1], A[i-2], B[i-1], B[i-2]`, dp[i-1][i-3], dp[i-1][i-2], dp[i-1][i-1], dp[i-3][i-1] and dp[i-2][i-1]. For simplicity, we will call them C1, C2, C4, C5, V1, V2, V3, V4 and V5. We want to find C3 and C6 (the values of `A[i]` and `B[i]`) and also W1, W2, W3, W4 and W5 (the values of column/row `i`).
    |   .   .  c4  c5  c6
----+---------------------
 .  |
 .  |              V5
 c1 |              V4  W5
 c2 |      V1  V2  V3  W4
 c3 |          W1  W2  W3
We can count the number of ways to fill the remaining positions greater than or equal to `i`, given the values of the `C_j` and `V_j`. The idea is to use brute force for the values of C3 and C6, find the `W_j` values and then transition to a new state (larger `i`). The requirement is that, at the end, dp[n-1][n-1] >= n - 2. We just need some ideas to cut down the number of states.
The worst issue at the moment is that the `C_j` values can be quite large (up to bound = 1000000000). Note that when filling W1, W2, …, we only need to know whether each pair `(C_p, C_q)` is equal or not. This means that many different sequences [C1, C2, C4, C5] yield the same result, so we can relabel them canonically: let C1 = 0; C2 = 0 if C2 = C1 and 1 otherwise; and so on, always giving a new value the smallest unused label. This greatly reduces the number of states: there are only 15 valid sequences C1, C2, C4, C5. It also speeds up the brute force for C3 and C6: there are only 5 options for C3 (it can be equal to C1, C2, C4 or C5, or be a new value altogether) and up to 6 options for C6. Just make sure to multiply by the proper factors.
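A sketch of that relabeling (map every value to the index of its first occurrence, so only the equality pattern survives; the helper name is ours):
def canonical(values):
    # The first distinct value becomes 0, the next new one 1, and so on.
    labels = {}
    out = []
    for v in values:
        if v not in labels:
            labels[v] = len(labels)
        out.append(labels[v])
    return out
For example, `canonical([7, 7, 2, 9])` and `canonical([1, 1, 5, 3])` both give `[0, 0, 1, 2]`, so those prefixes can share a state; the number of distinct canonical sequences of length 4 is the Bell number B(4) = 15, matching the count above.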
Another issue is that the values V1,V2,V3,V4,V5 can each be up to 100.
For a given `(i,j)`, the maximum `dp[i][j] = min(i,j) + 1`.
Consider the minimum relevant `dp[i][j]`. For this value to matter, it should be able to contribute to `dp[n-1][n-1] >= n - 2`. Every time the dp value is incremented along the way, both `i` and `j` advance as well, so the maximum amount by which `dp[i][j]` can still grow is about `min(n-1-j, n-1-i)`. So we need `dp[i][j] + min(n-1-j, n-1-i) >= n - 2`, that is, `dp[i][j] >= n - 2 - min(n-1-j, n-1-i)`. Assuming `j <= i` this gives roughly `i - 2`, and assuming `i <= j` it gives `j - 2`. In short, the minimum relevant value of dp[i][j] is `max(i,j) - 2`.
We can encode each dp[i][j] by a number `x` such that `dp[i][j] = max(i,j) - 3 + x`. If `x <= 0`, then `dp[i][j]` is too small to matter, and it makes no difference whether we remember `x` as 0, -1 or -2, so we store 0. Since `dp[i][j]` is at most `min(i,j) + 1`, `x` never needs to exceed 3. By remembering just the value of `x` for V1, V2, V3, V4 and V5, we have greatly reduced the number of states.
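A tiny sketch of that encoding, under the convention above that every "too small" value collapses to 0 (the helper names are ours):
def encode(dp_value, i, j):
    # Store dp[i][j] relative to max(i, j) - 3, clamping everything below at 0.
    return max(0, dp_value - (max(i, j) - 3))

def decode(x, i, j):
    # Only meaningful when x > 0; x == 0 stands for "too small to matter".
    return max(i, j) - 3 + x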
In total the two optimizations lead to a maximum of `4^5 * 15 * 100 = 1536000` states.
The implementation is complicated, as we need to encode the states and also make sure the values of C1, C2, C4 and C5 always follow the rules we set. A way to simplify it is to start with `i = 0`, assuming there were two previous indices (-1 and -2) whose columns and rows are full of zeros.